Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Authors

Arash Sharifi, Department of Computer Science, Science and Research Branch, Islamic Azad University, Tehran, Iran

Reza Ashrafidoost, Department of Computer Science, Science and Research Branch, Islamic Azad University, Tehran, Iran

Saeed Setayeshi, Amirkabir University of Technology, Tehran, Iran

Abstract:

Speech is one of the richest and most immediate ways for human beings to express emotion, and it conveys cognitive and semantic concepts between people. This study proposes a statistical method for recognizing emotion in speech signals, together with a learning approach built on that statistical model to classify the feelings underlying an utterance. The approach analyzes and tracks how the speaker's emotional state changes over the course of the speech. The proposed method classifies utterance emotions into six standard classes: boredom, fear, anger, neutral, disgust, and sadness. The well-known EmoDB speech corpus is used for the training phase. Once pre-processing is complete, meaningful speech patterns and attributes are extracted with the MFCC (mel-frequency cepstral coefficient) method and then selected with the SFS method. A statistical classification approach is then adapted for use as part of the method; this approach, called the LGMM, is used to categorize the obtained features. Afterwards, the classification results are used to illustrate the trend of emotional state changes and so reveal the speaker's feelings. The proposed model is also compared with some recent emotional speech classification models that use similar methods and materials. Experimental results show an acceptable overall recognition rate and stability in classifying uttered speech into six emotional states, and the proposed algorithm outperforms the other similar models in classification accuracy.
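The abstract does not spell out how the LGMM works, so the following is only a minimal sketch of the generic idea it builds on: fit one Gaussian mixture per emotion class with EM, label each speech segment by the class whose mixture gives the highest log-likelihood, and read the sequence of labels as the emotional state trend. It is a plain per-class GMM classifier, not the authors' LGMM. The 2-D synthetic vectors stand in for MFCC features after SFS selection, and the names `DiagGMM`, `centres`, `sample`, and `classify` are hypothetical, introduced here for illustration.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

class DiagGMM:
    """Small diagonal-covariance GMM trained with EM (illustrative only)."""

    def __init__(self, n_components=2, n_iter=30, var_floor=1e-3):
        self.k, self.n_iter, self.var_floor = n_components, n_iter, var_floor

    def fit(self, data):
        dim = len(data[0])
        # Deterministic init: evenly spaced samples as means, unit variances.
        step = len(data) // self.k
        self.means = [list(data[i * step]) for i in range(self.k)]
        self.vars = [[1.0] * dim for _ in range(self.k)]
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: responsibility of each component for each sample.
            resp = []
            for x in data:
                logs = [math.log(w) + log_gauss_diag(x, m, v)
                        for w, m, v in zip(self.weights, self.means, self.vars)]
                norm = logsumexp(logs)
                resp.append([math.exp(l - norm) for l in logs])
            # M-step: re-estimate weights, means, and floored variances.
            for j in range(self.k):
                nj = sum(r[j] for r in resp) + 1e-12
                self.weights[j] = nj / len(data)
                self.means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                                 for d in range(dim)]
                self.vars[j] = [max(self.var_floor,
                                    sum(r[j] * (x[d] - self.means[j][d]) ** 2
                                        for r, x in zip(resp, data)) / nj)
                                for d in range(dim)]
        return self

    def score(self, x):
        """Log-likelihood of one feature vector under the mixture."""
        return logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                          for w, m, v in zip(self.weights, self.means, self.vars)])

# The six classes named in the abstract.
EMOTIONS = ["boredom", "fear", "anger", "neutral", "disgust", "sadness"]

# Assumption: well-separated synthetic cluster centres replace real features.
rng = random.Random(0)
centres = {e: (6.0 * (i % 3), 6.0 * (i // 3)) for i, e in enumerate(EMOTIONS)}

def sample(emotion, n):
    """Draw n synthetic 2-D feature vectors around the class centre."""
    cx, cy = centres[emotion]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]

# Training phase: one mixture per emotion class.
models = {e: DiagGMM().fit(sample(e, 60)) for e in EMOTIONS}

def classify(x):
    """Return the emotion whose mixture assigns x the highest log-likelihood."""
    return max(models, key=lambda e: models[e].score(x))

# Emotional state trend: classify consecutive segments of one utterance.
segments = [centres["neutral"], centres["anger"], centres["anger"], centres["sadness"]]
trend = [classify(x) for x in segments]
print(trend)
```

With real data, each segment would be a feature vector (or a set of frame vectors whose log-likelihoods are summed) computed from the audio. The diagonal covariance and the variance floor are simplifications chosen to keep the sketch short and numerically stable.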

Similar resources

A new image thresholding method based on Gaussian mixture model

Abstract: In this paper, an efficient approach to searching for the global threshold of an image using a Gaussian mixture model is proposed. First, the gray-level histogram of an image is represented as a function of the gray-level frequencies. Then, to fit the Gaussian mixtures to the image histogram, the Expectation Maximization (EM) algorithm is developed to estimate the number of Gaussian mixt...

A study on insurer solvency by panel data model: the case of the Iranian insurance market

The aim of this thesis is to develop an approach for assessing the solvency of Iranian insurance companies. We use economic data with both time-series and cross-sectional variation, and therefore use a panel data model to examine insurer solvency.

Mouth Region Localization Method Based on Gaussian Mixture Model

This paper presents a new mouth region localization method which uses the Gaussian mixture model (GMM) of feature vectors extracted from mouth region images. Feature vectors based on the discrete cosine transform (DCT) and principal component analysis (PCA) are evaluated in mouth localization experiments. The new method is suitable for audio-visual speech recognition. This paper also introduc...

Refining Gaussian mixture model based on enhanced manifold learning

Gaussian mixture model (GMM) has been widely used for data analysis in various domains including text documents, face images and genes. GMM can be viewed as a simple linear superposition of Gaussian components, each of which represents a data cluster. Recent models, namely Laplacian regularized GMM (LapGMM) and locally consistent GMM (LCGMM) have been proposed to preserve the than the original ...

Speaker Clustering Based on Utterance-Oriented Dirichlet Process Mixture Model

This paper provides the analytical solution and algorithm of UO-DPMM in a non-parametric Bayesian manner, and thus realizes fully Bayesian speaker clustering. We carried out preliminary speaker clustering experiments using the TIMIT database to compare the proposed method with the conventional Bayesian Information Criterion (BIC) based method, which is an approximate Bayesian approach. T...

Eigenvoice conversion based on Gaussian mixture model

This paper describes a novel framework of voice conversion (VC). We call it eigenvoice conversion (EVC). We apply EVC to the conversion from a source speaker’s voice to arbitrary target speakers’ voices. Using multiple parallel data sets consisting of utterance pairs of the source and multiple pre-stored target speakers, a canonical eigenvoice GMM (EV-GMM) is trained in advance. That conversion ...

Journal title

Journal of Advances in Computer Engineering and Technology

Volume 3, Issue 2

Pages 113-124

Publication date: 2017-05-01

Keywords

speech processing, emotional states, pattern recognition, mel frequency cepstral coefficient, Gaussian mixture model
